[Wait for #2286] [SWAP] Implement inference mode #2300
base: main
Conversation
- Match requestMemory arguments with memory_pool - Added override keyword Signed-off-by: hyeonseok lee <[email protected]>
- To support scaled dot product on the attention layer, as described in the paper "Attention Is All You Need", add a scaled dot product property Signed-off-by: hyeonseok lee <[email protected]>
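For context, scaled dot-product attention as defined in that paper scales the query-key similarity by the key dimension before the softmax; how the new property wires this into the layer is defined by the PR itself:

```math
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
```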
- To provide dynamic input dimensions, implement a reinitialize function - This commit is a PoC of reinitialize, so much of the code is copied from the initialize function and still needs to be refined. Signed-off-by: hyeonseok lee <[email protected]>
- Added causal mask in attention layer - Implements PicoGPT Signed-off-by: hyeonseok lee <[email protected]>
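As a rough illustration of the causal mask idea (names and buffer layout here are assumptions, not the actual attention layer code), future positions are pushed to negative infinity before the softmax so that token i can only attend to tokens 0..i:

```cpp
// Illustrative sketch only: apply a causal mask to a [seq_len x seq_len]
// score matrix stored row-major, masking out future positions.
#include <limits>
#include <vector>

void applyCausalMask(std::vector<float> &scores, size_t seq_len) {
  const float neg_inf = -std::numeric_limits<float>::infinity();
  for (size_t i = 0; i < seq_len; ++i)
    for (size_t j = i + 1; j < seq_len; ++j)
      scores[i * seq_len + j] = neg_inf; // token i cannot see token j > i
}
```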
Implement picoGPT/GPT-2's encoder in C++ using nlohmann/json.hpp, so we need to add an include path to compile the JSON parser. Signed-off-by: Donghak PARK <[email protected]>
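A minimal sketch of what loading the GPT-2 token table with nlohmann/json can look like; the file name and map layout are assumptions and not the code in encoder.hpp:

```cpp
// Minimal sketch, assuming a GPT-2 style "encoder.json" (token -> id map).
#include <fstream>
#include <string>
#include <unordered_map>
#include <nlohmann/json.hpp>

std::unordered_map<std::string, int> loadEncoder(const std::string &path) {
  std::ifstream file(path);
  nlohmann::json j = nlohmann::json::parse(file);
  std::unordered_map<std::string, int> vocab;
  for (auto &[token, id] : j.items())
    vocab.emplace(token, id.get<int>());
  return vocab;
}
```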
Add PicoGPT user input handling and add comments in encoder.hpp Signed-off-by: Donghak PARK <[email protected]>
This PR includes the PicoGPT (https://github.com/jaymody/picoGPT) Android application with NNTrainer. We only use the PicoGPT model binary; the NNTrainer implementation is provided in nnstreamer#2212. This is the Android application implementation for that PR.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
- PoC of incremental inference - Only works if batch and channel size are 1 - For the concat layer, the incremental inference step only works when the concat axis is the width axis Signed-off-by: hyeonseok lee <[email protected]>
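Conceptually, incremental inference means that after the prompt has been processed once, each generation step feeds only the newly produced token positions to the model instead of re-running the full sequence. Below is a hedged sketch of that loop with the model call abstracted behind a callback; it is illustrative and not the NNTrainer API added by this PR:

```cpp
// Sketch of an incremental generation loop: each step runs the model only
// on positions [from, to) and appends the predicted next token.
#include <functional>
#include <vector>

std::vector<int>
generate(std::vector<int> tokens, int total_len,
         const std::function<int(const std::vector<int> &, int /*from*/,
                                 int /*to*/)> &step) {
  for (int pos = static_cast<int>(tokens.size()); pos < total_len; ++pos) {
    int next = step(tokens, pos - 1, pos); // only the newest position
    tokens.push_back(next);
  }
  return tokens;
}
```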
- Each thread copies the data in the batchwise direction Signed-off-by: hyeonseok lee <[email protected]>
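A short sketch of what a batch-wise parallel copy can look like, one thread per batch slice; the buffer layout and names are assumptions, not the NNTrainer tensor code:

```cpp
// Sketch: copy a contiguous [batch x per_batch_size] float buffer, spawning
// one thread per batch slice.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

void copyBatchwise(const float *src, float *dst, size_t batch,
                   size_t per_batch_size) {
  std::vector<std::thread> workers;
  workers.reserve(batch);
  for (size_t b = 0; b < batch; ++b) {
    const float *s = src + b * per_batch_size;
    float *d = dst + b * per_batch_size;
    workers.emplace_back(
      [s, d, per_batch_size]() { std::copy(s, s + per_batch_size, d); });
  }
  for (auto &t : workers)
    t.join();
}
```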
- Apply incremental inference to PicoGPT Signed-off-by: hyeonseok lee <[email protected]>
This PR includes fixes for running GPT. Signed-off-by: jijoong.moon <[email protected]>
This PR includes some fixes to run PicoGPT with W16A16 on Android using NEON.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
This PR includes:
- Fixes to enable memory optimization
- Removal of an unnecessary memory buffer
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
Signed-off-by: Jiho Chu <[email protected]>
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2300. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review process by reviewers starts. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. To monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202308311949210.47621703147888-70967764de236a28f1c8ab1a1c5d83aaff745c49/.
Signed-off-by: Jiho Chu <[email protected]>
LGTM!
Force-pushed from 7096776 to 73f568b
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309032343410.87508106231689-73f568beb28f0012aacd1d79664903387cce7dc4/.
Please remove generated files, including *.lock.
This patch is for inference mode on the swap device. It re-enables the mmap feature, but the write timing is controlled manually to handle inference mode. Signed-off-by: Jiho Chu <[email protected]>
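Since weights are read-only during inference, the swap path can map them without ever writing back. A minimal POSIX sketch of that idea, assuming a plain weight file; this is illustrative, not the actual swap-device code:

```cpp
// Sketch: map a weight file read-only; no write-back is needed because the
// mapping is never modified during inference.
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

void *mapWeightsReadOnly(const char *path, size_t &size_out) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return nullptr;
  struct stat st;
  if (fstat(fd, &st) != 0) {
    close(fd);
    return nullptr;
  }
  size_out = static_cast<size_t>(st.st_size);
  void *ptr = mmap(nullptr, size_out, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd); // the mapping remains valid after the descriptor is closed
  return ptr == MAP_FAILED ? nullptr : ptr;
}
```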
Force-pushed from 73f568b to 46547d7
We generate a report if there are dangerous coding constructs in your code. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309071954050.96274399757385-14e9875d785f395a0d9a2d092d31a6aac38abcc9/report/.
INFO: You can check whether there are misspelled words in our misspelling check report. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309071954050.96274399757385-14e9875d785f395a0d9a2d092d31a6aac38abcc9/report/.
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309071954050.96274399757385-14e9875d785f395a0d9a2d092d31a6aac38abcc9/.
This patch removes unnecessary files. Signed-off-by: Jiho Chu <[email protected]>
Force-pushed from 14e9875 to b0eed2c
We generate a report if there are dangerous coding constructs in your code. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309072105160.96397399902344-b0eed2c2a07c36c863b09b4eeb84530b8f0348a0/report/.
INFO: You can check whether there are misspelled words in our misspelling check report. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309072105160.96397399902344-b0eed2c2a07c36c863b09b4eeb84530b8f0348a0/report/.
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309072105160.96397399902344-b0eed2c2a07c36c863b09b4eeb84530b8f0348a0/.
Thanks. The lock files have been removed.
This patch adds inference mode for the swap device.
It adds a "memory_swap_mode" property, which handles inference mode when the swap device is used.
The weights are not modified during inference, so their data does not need to be swapped out to the device.
Signed-off-by: Jiho Chu [email protected]
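A hedged sketch of how the new property might be set through NNTrainer's C++ API; the value "inference" and the accompanying memory_swap=true toggle are assumptions about usage, not confirmed by this PR:

```cpp
// Sketch, assuming the value "inference" for the new memory_swap_mode
// property; consult the PR for the actual accepted values.
#include <memory>
#include <model.h> // NNTrainer ml::train C++ API

int main() {
  auto model = ml::train::createModel(ml::train::ModelType::NEURAL_NET);
  model->setProperty({"memory_swap=true",
                      "memory_swap_mode=inference"}); // property added by this PR
  return 0;
}
```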